Balanced-MixUp for Highly Imbalanced Medical Image Classification
Abstract
Highly imbalanced datasets are ubiquitous in medical image classification problems. In such problems, it is often the case that rare classes associated with less prevalent diseases are severely under-represented in labeled databases, typically resulting in poor performance of machine learning algorithms due to overfitting in the learning process. In this paper, we propose a novel mechanism for sampling training data based on the popular MixUp regularization technique, which we refer to as Balanced-MixUp. In short, Balanced-MixUp simultaneously performs regular (i.e., instance-based) and balanced (i.e., class-based) sampling of the training data. The resulting two sets of samples are then mixed up to create a more balanced training distribution from which a neural network can effectively learn without heavily under-fitting the minority classes. We experiment with a highly imbalanced dataset of retinal images (55K samples, 5 classes) and a long-tail dataset of gastro-intestinal video frames (10K images, 23 classes), using CNNs of varying representation capabilities. Experimental results demonstrate that applying Balanced-MixUp outperforms other conventional sampling schemes and loss functions specifically designed to deal with imbalanced data. Code is released at https://github.com/agaldran/balanced_mixup.
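The sampling-and-mixing idea described in the abstract can be sketched roughly as follows. This is a minimal NumPy illustration, not the authors' released implementation: the function names, the `alpha` default, and the choice to draw the mixing weight from a Beta(alpha, 1) distribution and apply it to the class-balanced batch are all assumptions made for the example.

```python
import numpy as np


def instance_sample(labels, n, rng):
    """Regular (instance-based) sampling: every example is equally likely."""
    return rng.integers(0, len(labels), size=n)


def class_balanced_sample(labels, n, rng):
    """Balanced (class-based) sampling: pick a class uniformly, then one of its examples."""
    classes = np.unique(labels)
    chosen = rng.choice(classes, size=n)
    return np.array([rng.choice(np.flatnonzero(labels == c)) for c in chosen])


def balanced_mixup_batch(x, labels, num_classes, batch_size, alpha=0.2, rng=None):
    """Mix an instance-sampled batch with a class-balanced batch (hypothetical helper).

    Assumption: lam ~ Beta(alpha, 1) weights the class-balanced batch, so a small
    alpha keeps most of the mass on the instance-sampled examples.
    """
    rng = np.random.default_rng() if rng is None else rng
    idx_i = instance_sample(labels, batch_size, rng)
    idx_b = class_balanced_sample(labels, batch_size, rng)

    one_hot = np.eye(num_classes)[labels]            # (N, num_classes)
    lam = rng.beta(alpha, 1.0, size=batch_size)      # one weight per mixed pair
    lam_x = lam.reshape(-1, *([1] * (x.ndim - 1)))   # broadcast over image dims

    x_mix = lam_x * x[idx_b] + (1.0 - lam_x) * x[idx_i]
    y_mix = lam[:, None] * one_hot[idx_b] + (1.0 - lam[:, None]) * one_hot[idx_i]
    return x_mix, y_mix
```

The mixed batch and its soft labels would then be fed to a standard cross-entropy loss over soft targets. In this sketch `labels` is expected to be an integer NumPy array and `x` an array whose first axis indexes examples.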
Similar resources
Predictive Data Mining for Highly Imbalanced Classification
The paper addresses some theoretical and practical aspects of data mining, focusing on predictive data mining, where two central types of prediction problems are discussed: classification and regression. Further emphasis is placed on predictive data mining, where time-stamped data greatly increase the dimensions and complexity of problem solving. The main goal is through processing of data (rec...
An Effective Approach for Imbalanced Classification: Unevenly Balanced Bagging
Learning from imbalanced data is an important problem in data mining research. Much research has addressed the problem of imbalanced data by using sampling methods to generate an equally balanced training set to improve the performance of the prediction models, but it is unclear what ratio of class distribution is best for training a prediction model. Bagging is one of the most popular and effe...
A Prediction for Classification of Highly Imbalanced Medical Dataset Using Databoost.IM with SVM
Recently, class imbalance problems have attracted growing interest because of the classification difficulty caused by imbalanced class distributions. In particular, many ensemble learning and machine learning methods have been proposed for the classification of imbalanced problems. However, these methods produce poor predictive accuracy for two-class imbalanced datasets. In this paper,...
On Mining Fuzzy Classification Rules for Imbalanced Data
The fuzzy rule-based classification system (FRBCS) is a popular machine learning technique for classification purposes. One of the major issues when applying it to imbalanced data sets is its bias toward the majority class, such that it performs poorly with respect to the minority class. However, in many cases the minority classes are more important than the majority ones. In this paper, we have extended ...
Classification of Imbalanced Marketing Data with Balanced Random Sets
With imbalanced data, a classifier built using all of the data has a tendency to ignore the minority class. To overcome this problem, we propose to use an ensemble classifier constructed on the basis of a large number of relatively small and balanced subsets, where representatives from both patterns are to be selected randomly. As an outcome, the system produces the matrix of linear regressio...
Journal
Journal title: Lecture Notes in Computer Science
Year: 2021
ISSN: 1611-3349, 0302-9743
DOI: https://doi.org/10.1007/978-3-030-87240-3_31